class: center, middle, inverse, title-slide .title[ # ADIM: Bayesian Computation and Mixed models ] .subtitle[ ## Master’s Degree in Data Analysis, Process Improvement and Decision Support Engineering ] .author[ ###
Joaquín Martínez-Minaya, 2024-12-09
VAlencia BAyesian Research Group
Statistical Modeling Ecology Group
Grupo de Ingeniería Estadística Multivariante
] .date[ ###
jmarmin@eio.upv.es
]

---
class: inverse, center, middle, animated, slideInRight

# Motivation example

---

# Heart Disease

.left-column9[

- The study examines the relationship between:
  - the occurrence of .hlb[myocardial infarction] (MI): `\(y = 1\)` if MI occurred, `\(y = 0\)` otherwise; and
  - .hlb[Age60]: Patients aged `\(\geq 60\)` (1) versus `\(< 60\)` (0).
  - .hlb[Systolic blood pressure (SBP140)]: SBP `\(\geq 140\)` mmHg (1) versus `\(< 140\)` mmHg (0).

- .hlbred[Objective]:
  - Evaluate the association of age60 and sbp140 with MI probability.
  - Interpret the odds ratio (OR) for both predictors.
]

--

.right-column9[

Table: First rows of the study data

| y | age60 | sbp140 |
|:-:|:-----:|:------:|
| 0 | <60 | >=140 |
| 0 | >=60 | <140 |
| 0 | <60 | >=140 |
| 0 | >=60 | >=140 |
| 0 | >=60 | <140 |
| 1 | <60 | <140 |
]

---

# Bayesian Logistic Regression Model

- .hlbpurple[Logistic regression] is used to model the probability of MI based on age60 and sbp140.

- .hlb[Likelihood]
$$ y_i \sim \text{Bernoulli}(\pi_i)\,, i= 1, \ldots, 400\,, $$
using logit link:
$$ \text{logit}(\pi_i) = \beta_0 + \beta_1 \, \text{age60}_i + \beta_2 \, \text{sbp140}_i$$

- .hlb[Prior distributions] (weakly-informative):
\begin{eqnarray}
\beta_0 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_1 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_2 & \sim & \mathcal{N}(0, 10^3). \nonumber
\end{eqnarray}

--

.hlbred[Note: There are no conjugate priors available for the logistic regression model.]

---

# Table of contents

## 1. Bayesian computation. MCMC methods

## 2. Bayesian Software for MCMC

## 3. Hierarchical Bayesian models

---
class: center, middle, animated, rotateInUpRight, inverse

# 1. Bayesian computation. MCMC methods

---

# Monte Carlo Methods

.left-column9[

.hlbpurple[Monte Carlo Simulation]

- Draw .hlb[realizations of a random variable] for which only its density function is (fully or partially) known.
``` r
x <- rnorm(1000, mean = 1, sd = 2)
```

<!-- -->

]

--

.right-column9[

.hlbpurple[Monte Carlo Integration]

- Computing the mean of a N(1, 2^2):
  - `\(E(X) = \int_{-\infty}^{\infty} x \cdot f(x) dx\)`

- Using .hlb[Monte Carlo integration]:
  - Simulate from N(1, 2^2): `\(\phi^1, \ldots, \phi^N\)`.
  - Compute the mean of the simulated values: `\(E(X) \approx \frac{1}{N} \sum_{i=1}^N \phi^i\)`

- Applying `summary` to the simulated values, we obtain additional measures:

| Min.| 1st Qu.| Median| Mean| 3rd Qu.| Max.|
|------:|-------:|------:|-----:|-------:|-----:|
| -6.441| -0.38| 0.939| 0.965| 2.32| 7.714|
]

---

# Markov Chain Monte Carlo

- A Markov chain is a .hlbred[stochastic sequence of numbers] where each value in the sequence depends only upon the last.

--

- If `\(\phi^1, \phi^2, \ldots, \phi^N\)` is a sequence of numbers, then `\(\phi^2\)` is only a function of `\(\phi^1\)`, `\(\phi^3\)` of `\(\phi^2\)`, etc.

--

- Under certain conditions, the distribution over the states of the .hlb[Markov chain] will .hlb[converge to a stationary distribution].

--

- The .hlb[stationary distribution is independent of the initial starting values] specified for the chains.

--

- AIM: construct a Markov chain such that .hlb[the stationary distribution is equal to the posterior distribution] `\(p(\theta \mid x)\)`.

--

- We combine Markov chains with Monte Carlo simulation --> .hlbpurple[Markov chain Monte Carlo (MCMC)].

--

- They were first popularized in the statistics community by <a href = "https://www.tandfonline.com/doi/abs/10.1080/01621459.1990.10476213"> Gelfand and Smith (1990) </a>.
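The convergence claim above can be illustrated with a minimal sketch in R (the two-state transition matrix, chain length, and starting state are invented for illustration): simulating a small Markov chain long enough, the empirical state frequencies approach the stationary distribution whatever the starting state.

``` r
set.seed(1)

# Two-state transition matrix: row j gives P(next state | current state j)
P <- matrix(c(0.9, 0.1,
              0.2, 0.8), nrow = 2, byrow = TRUE)

# Simulate the chain: each value depends only on the previous one
n <- 1e5
chain <- numeric(n)
chain[1] <- 2   # starting state (the limiting frequencies do not depend on it)
for (t in 2:n) {
  chain[t] <- sample(1:2, size = 1, prob = P[chain[t - 1], ])
}

# Empirical frequencies approach the stationary distribution (2/3, 1/3),
# the solution of pi = pi %*% P
prop.table(table(chain))
```

Changing `chain[1]` to 1 leaves the long-run frequencies essentially unchanged, which is exactly the "independent of the initial starting values" property used by MCMC.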
---

# Posterior distribution

## Estimating the probability of scoring a penalty

### - .hlb[Likelihood]

$$p(\boldsymbol{y} \mid \pi) = \pi^{k}(1-\pi)^{N-k} $$

--

### - .hlb[Prior distribution]

$$p(\pi) \propto \pi^{a-1}(1-\pi)^{b-1} $$

--

### - .hlb[Posterior distribution]

`$$p(\pi \mid \boldsymbol{y}) \propto p(\boldsymbol{y} \mid \pi) \times p(\pi) \propto \pi^{k + a - 1}(1 - \pi)^{N - k + b - 1} = p^*(\pi)$$`

---

# MCMC: Metropolis-Hastings (MH)

1. Starting value `\(\pi^{(0)}\)`
2. For `\(t = 1, \ldots, T\)`
  - .hlb[We define a proposal distribution] (usually chosen to resemble the target distribution). In this case, `\(q(\pi \mid \pi^{(t-1)}) \sim \text{logit-N}(\pi^{(t-1)}, \sigma = 0.5)\)`. .hlb[Simulate] `\(\pi^{(prop)}\)` from it.
  - Compute the .hlbred[probability of acceptance]: `$$\alpha = \text{min}\left(1, \frac{p^*(\pi^{(prop)})q(\pi^{(t-1)} \mid \pi^{(prop)})}{p^*(\pi^{(t-1)})q(\pi^{(prop)} \mid \pi^{(t-1)})}\right)$$`
  - Generate a .hlb[random number] `\(u\)` from the Uniform(0, 1).
  - `\(\pi^{(t)} = \pi^{(prop)}\)`, if `\(u \leq \alpha\)`,
  - `\(\pi^{(t)} = \pi^{(t-1)}\)`, if `\(u > \alpha\)`.
3. Finally, we .hlb[obtain] `\(\pi^{(0)}, \pi^{(1)}, \ldots, \pi^{(T)}\)`, which, after discarding the burn-in iterations, is .hlb[a sample from the posterior distribution].

---

# MCMC: Metropolis-Hastings (MH)

.center[



]

---

# Approximating the probability of scoring using MH

.left-column9[

.hlbpurple[Visual Metropolis-Hastings]
- Play the video from .hlb[minute 4:44].

]

--

.right-column9[

.hlb[Tracing the chain. Is the chain autocorrelated?]



]

---

# MCMC. Burn-in and thinning

.left-column9[



]

.right-column9[





]

---
class: center, middle, animated, rotateInUpRight, inverse

# 2. Bayesian Software

---

# Software

.left-column9[

- <a href = "https://mcmc-jags.sourceforge.io/"> JAGS </a>
- <a href="https://mc-stan.org/"> Stan </a>
- <a href = "https://paulbuerkner.com/brms/"> brms </a>: allows easy Bayesian inference using Hamiltonian Monte Carlo
- <a href="https://www.r-inla.org/"> INLA </a> (it does not use MCMC methods, but it is a very powerful tool)
- <a href = "https://inlabru-org.github.io/inlabru/"> inlabru </a>: facilitates Bayesian inference for spatio-temporal models
- <a href= "https://cran.r-project.org/web/packages/MCMCpack/index.html"> MCMCpack </a> (with this R package, you can use MCMC methods with notation similar to standard R model syntax)
- <a href="https://r-nimble.org/"> Nimble </a>
]

.right-column9[



]

---

# Bayesian Logistic Regression using JAGS

.left-column8[

- .hlb[Likelihood]
$$ y_i \sim \text{Bernoulli}(\pi_i)\,, i= 1, \ldots, 400\,, $$
using logit link:
$$ \text{logit}(\pi_i) = \beta_0 + \beta_1 \, \text{age60}_i + \beta_2 \, \text{sbp140}_i$$

- .hlb[Prior distributions] (weakly-informative):
`$$\beta_0 \sim \mathcal{N}(0, 10^3),$$`
`$$\beta_1 \sim \mathcal{N}(0, 10^3),$$`
`$$\beta_2 \sim \mathcal{N}(0, 10^3).$$`
]

.right-column9[

``` r
model_string <- "
model {
  for (i in 1:N) {
    y[i] ~ dbern(pi[i])
    logit(pi[i]) <- beta0 + beta1 * age60[i] + beta2 * sbp140[i]
  }
  # Priors for regression coefficients
  # (JAGS parameterizes dnorm by precision: 0.001 = 1/10^3)
  beta0 ~ dnorm(0, 0.001)
  beta1 ~ dnorm(0, 0.001)
  beta2 ~ dnorm(0, 0.001)
}"
```

Check `S1-JAGS-heart_attack.Rmd` for the complete solution
]

---

# Bayesian Logistic Regression using brms

.left-column8[

- .hlb[Likelihood]
$$ y_i \sim \text{Bernoulli}(\pi_i)\,, i= 1, \ldots, 400\,, $$
using logit link:
$$ \text{logit}(\pi_i) = \beta_0 + \beta_1 \, \text{age60}_i + \beta_2 \, \text{sbp140}_i $$

- .hlb[Prior
distributions] (weakly-informative):
`$$\beta_0 \sim \mathcal{N}(0, 10^3),$$`
`$$\beta_1 \sim \mathcal{N}(0, 10^3),$$`
`$$\beta_2 \sim \mathcal{N}(0, 10^3).$$`
]

.right-column9[

``` r
formula <- bf(y ~ age60 + sbp140, family = bernoulli())

# Weakly-informative priors; brms' normal() is parameterized
# by the standard deviation, so sd = sqrt(10^3), i.e. approx. 31.62
priors <- c(prior(normal(0, 31.62), class = "Intercept"),
            prior(normal(0, 31.62), class = "b"))

# Fit the model using brms
fit_brms <- brm(formula,
                data = data_hattack,
                prior = priors,
                chains = 3,    # Number of MCMC chains
                iter = 5000,   # Total number of iterations per chain
                warmup = 1000, # Number of iterations for warm-up
                thin = 1,      # Thinning interval
                seed = 123)    # Seed for reproducibility

# Summary of the fitted model
summary(fit_brms)
```

Check `S1-brms-heart_attack.Rmd` for the complete solution
]

---

# Exercise: Predicting diabetes

.left-column9[

We study diabetes in .hlb[Pima Indian women aged 21 and older]. The dataset contains diagnostic medical features and an additional zone variable (`zona1` to `zona30`), which categorizes patients geographically (zones have been simulated).

.hlbred[Aim:]

<div style="float: right; margin-right: 20px; width: 50%;">
<img src="data:image/png;base64,#images/diabetes.png" width="100%" />
</div>

Examine the relationships between medical predictors and the likelihood of diabetes.
]

.right-column9[The dataset includes features such as:

- **Pregnancies**: Number of pregnancies.
- **Glucose**: Plasma glucose levels.
- **BloodPressure**: Diastolic blood pressure (mm Hg).
- **SkinThickness**: Triceps skin fold thickness (mm).
- **Insulin**: 2-hour serum insulin (`\(\mu U/ml\)`).
- **BMI**: Body mass index (`\(kg/m^2\)`).
- **DiabetesPedigreeFunction**: Family history risk score.
- **Age**: Age in years.
- **Zone**: Zone of residence, from `zona1` to `zona30`.
- **Outcome**: Indicator of diabetes diagnosis (0 = No, 1 = Yes).
]

---

# Exercise: Predicting diabetes

``` r
# Load dataset
data_diab <- readxl::read_excel("../data/diabetes.xlsx")
kable(head(data_diab))
```

| Pregnancies| Glucose| BloodPressure| SkinThickness| Insulin| BMI| DiabetesPedigreeFunction| Age| Outcome|
|-----------:|-------:|-------------:|-------------:|--------:|----:|------------------------:|---:|-------:|
| 6.000000| 148| 72| 35.00000| 155.5482| 33.6| 0.627| 50| 1|
| 1.000000| 85| 66| 29.00000| 155.5482| 26.6| 0.351| 31| 0|
| 8.000000| 183| 64| 29.15342| 155.5482| 23.3| 0.672| 32| 1|
| 1.000000| 89| 66| 23.00000| 94.0000| 28.1| 0.167| 21| 0|
| 4.494673| 137| 40| 35.00000| 168.0000| 43.1| 2.288| 33| 1|
| 5.000000| 116| 74| 29.15342| 155.5482| 25.6| 0.201| 30| 0|

---

# Predicting diabetes: Bayesian GLM

.hlb[Logistic regression] models the probability of diabetes based on selected predictors.

- .hlb[Likelihood]
$$ y_i \sim \text{Bernoulli}(\pi_i), \quad i = 1, \ldots, n, $$
using a logit link:
$$ \text{logit}(\pi_i) = \beta_0 + \beta_1 \, \text{Glucose}_i + \beta_2 \, \text{BMI}_i + \beta_3 \, \text{Age}_i. $$

- .hlb[Prior distributions] (weakly-informative):
\begin{eqnarray}
\beta_0 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_1 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_2 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_3 & \sim & \mathcal{N}(0, 10^3). \nonumber
\end{eqnarray}

---
class: inverse, center, middle, animated, rotateInUpLeft

# 3. Hierarchical Bayesian Models

---

# Example: Scoring penalties Valencia C. F.

.left-column9[

- Liga Santander is one of the most famous leagues in the world. In this example, we use data from the last 10 seasons to estimate the probability of success `\((\pi)\)` when scoring a penalty for .hlb[Valencia Club de Fútbol].
]

.right-column9[



]

---

# Again we talk about football

## - We consider the same experiment in .hlb[10 different teams]

--

## - How can we model this situation, and what can we conclude?

--

## - More generally, how can we incorporate .hlb[random effects]?
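A quick way to build intuition about this situation is to simulate it (a hedged sketch: the Beta(20, 5) hyperparameters, the 40 penalties per team, and the team labels are invented for illustration). Each team has its own success probability, but all probabilities are drawn from a common distribution, so teams are similar without being identical.

``` r
set.seed(123)
J   <- 10   # number of teams
n_j <- 40   # penalties observed per team (invented)

# Team-specific success probabilities drawn from a common Beta distribution
pi_j <- rbeta(J, shape1 = 20, shape2 = 5)

# Observed scored penalties for each team
scored <- rbinom(J, size = n_j, prob = pi_j)

data.frame(team     = paste0("team", 1:J),
           pi_true  = round(pi_j, 2),
           observed = round(scored / n_j, 2))
```

The spread of the observed proportions across teams is precisely the between-team variability that a random effect is designed to capture.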
---

# Three ways to do so

### 1. .hlbred[All teams have the same characteristics].

.left-column9[

- Apply a .hlbred[joint analysis] to all the teams.
- The probability of scoring a penalty `\((\pi)\)` is the .hlbred[same in all teams].
- Observations are independent and identically distributed.

`$$y_i \mid \pi \sim \text{Ber}(\pi)$$`
`$$\pi \sim \text{Beta}(a,b)\,, \text{ with } a \text{ and } b \text{ fixed}$$`
]

.right-column9[
.center[



]
]

---

# Three ways to do so

### 2. .hlbred[Each team is different and has nothing in common with the others].

.left-column9[

- Apply an analysis to .hlbred[each team separately].
- Assume a .hlbred[different success proportion] in each one: `\(\pi_j\)`, `\(j=1, \ldots, J\)`. In this case, `\(J = 10\)`.
- Observations are independent but are .hlb[distributed differently in each team].
- The .hlb[likelihood] is different for each team. For each `\(j\)`,

`$$y_{ij} \mid \pi_j \sim \text{Ber}(\pi_j)$$`
`$$\pi_j \sim \text{Beta}(a_j, b_j)\,, \text{ with } a_j \text{ and } b_j \text{ fixed}$$`
]

.right-column9[
.center[



]
]

---

# In view of the two possible models

### - Is it reasonable to assume .hlbred[the same success proportion] in all teams?

--

### - There are reasons to suggest that .hlbred[there is variability in those proportions]:
  - The teams do not behave the same way.
  - The observations of the same team are more similar among themselves than when they are from different teams.

--

### - Is it reasonable to think that .hlbred[there is no relationship between the success proportions] of the different teams?

## Although not identical, .hlbred[teams] are at least .hlb[similar].

---

# Three ways to do so

.left-column7[

### 3. Consider a .hlbred[hierarchical model].

- The parametric vector `\(\boldsymbol{\pi} = (\pi_1, \ldots, \pi_{J})\)` is a .hlbred[random sample from a common distribution] that depends on a vector of .hlbred[hyperparameters], `\(\alpha\)` and `\(\beta\)`, partially or totally unknown.
- The model

`$$y_{ij} \mid \pi_j \sim \text{Ber}(\pi_j)\,, \quad j= 1, \ldots, J$$`
$$\pi_j \sim \text{Beta}(\alpha, \beta) $$
$$\alpha \sim p(\alpha)\,, \quad \beta \sim p(\beta) $$
]

.right-column9[



]

---

# Predicting diabetes: Bayesian GLM

.hlbpurple[Logistic regression] models the probability of diabetes based on selected predictors.

- .hlb[Likelihood]
$$ y_i \sim \text{Bernoulli}(\pi_i), \quad i = 1, \ldots, n, $$
using a logit link:
$$ \text{logit}(\pi_i) = \beta_0 + \beta_1 \, \text{Glucose}_i + \beta_2 \, \text{BMI}_i + \beta_3 \, \text{Age}_i. $$

- .hlb[Prior distributions] (weakly-informative):
\begin{eqnarray}
\beta_0 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_1 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_2 & \sim & \mathcal{N}(0, 10^3), \nonumber \\
\beta_3 & \sim & \mathcal{N}(0, 10^3). \nonumber
\end{eqnarray}

--

- .hlbred[Geographical zone: Additional variability is captured by including the patient's zone (`zona1` to `zona30`) as a random effect.]

---

# Diabetes. Bayesian Mixed GLMs

.hlbpurple[Logistic regression] models the probability of diabetes, including a random effect for geographical zones to account for variability.

- .hlb[Likelihood]
`\begin{eqnarray}
y_i & \sim & \text{Bernoulli}(\pi_i), \quad i = 1, \ldots, n, \nonumber \\
\text{logit}(\pi_i) & = & \beta_0 + \beta_1 \, \text{Glucose}_i + \beta_2 \, \text{BMI}_i + \beta_3 \, \text{Age}_i + u_{\text{zone}[i]}. \nonumber
\end{eqnarray}`

- .hlb[Prior distributions] (weakly-informative):
`\begin{eqnarray}
\beta_j & \sim & \mathcal{N}(0, 10^3), \, j=0, \ldots, 3, \nonumber \\
u_{\text{zone}[i]} & \sim & \mathcal{N}(0, \sigma_u^{2}). \nonumber
\end{eqnarray}`

- .hlb[Hyperparameters]: The standard deviation `\(\sigma_u\)` captures the degree of heterogeneity across geographical zones (`zona1` to `zona30`).
`\begin{eqnarray}
\sigma_u & \sim & \text{Half-Cauchy}(0, 2).
\nonumber
\end{eqnarray}`

---

# Diabetes using `brms`

``` r
# Define formula with a random effect for zone
formula <- bf(Outcome ~ Glucose + BMI + Age + (1 | zone),
              family = bernoulli())

# Set priors (brms' normal() uses the standard deviation:
# sd = sqrt(10^3), i.e. approx. 31.62, matches N(0, 10^3) above)
priors <- c(
  prior(normal(0, 31.62), class = "b"),              # Priors for fixed effects
  prior(cauchy(0, 2), class = "sd", group = "zone"), # Prior for random-effect SD
  prior(normal(0, 31.62), class = "Intercept")       # Prior for intercept
)

# Fit the model using brms
fit_brms <- brm(
  formula = formula,
  data = data_diab,
  prior = priors,
  chains = 4,
  iter = 4000,
  warmup = 1000)
```

---

<!-- --- -->
<!-- # Numerical approaches -->
<!-- - When applying Bayesian Statistics, most of the usual models do not yield analytic expressions for neither the posterior nor the predictive posterior distributions. -->
<!-- -- -->
<!-- - Most of the .hlb[complications that appear in the Bayesian methodology] come from the resolution of integrals that appear when applying the learning process: -->
<!-- - The normalization constant of the posterior distribution, -->
<!-- - moments and quantiles of the posterior, -->
<!-- - credible regions, probabilities in the contrasts, etc. -->
<!-- -- -->
<!-- ### Solutions: -->
<!-- - .hlb[Monte Carlo methods: MCMC]. -->
<!-- - .hlb[INLA]. -->

---

# What we have learned so far

- ### The biggest challenge for Bayesian inference has always been the .hlb[computational power] that more complex models require.

--

- ### .hlb[MCMC methods allow computationally estimating posterior distributions] that are analytically intractable.

--

- ### Today, many research teams are developing computational techniques for approximating posterior distributions.

---

# References

.hlb[Books]

- .hlb[Stan & R]: Lambert, B. (2018). A Student's Guide to Bayesian Statistics. SAGE Publications.
- .hlb[JAGS]: Plummer, M. (2019). JAGS User Manual Version 4.3.0.
- .hlb[OpenBUGS]: Cowles, M. K. (2013). Applied Bayesian Statistics: with R and OpenBUGS Examples (Vol. 98).
Springer Science & Business Media.
- .hlb[WinBUGS]: Ntzoufras, I. (2011). Bayesian Modeling Using WinBUGS. John Wiley & Sons.
- .hlb[Stan]: Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian Data Analysis. Chapman and Hall/CRC.
- .hlb[INLA]: Gómez-Rubio, V. (2020). Bayesian Inference with INLA. CRC Press.

<!-- - .hlb[Disease mapping]: Martínez-Beneito, M. A., & Botella-Rocamora, P. (2019). Disease mapping: from foundations to multidimensional modeling. CRC Press. -->
<!-- - .hlb[Bayesian modelling]: Congdon, P. (2007). Bayesian statistical modelling. John Wiley & Sons. -->

---

# References

.hlb[Blogs]

- http://wlm.userweb.mwn.de/R/wlmRcoda.htm
- https://rpubs.com/FJRubio/IntroMCMC
- https://darrenjw.wordpress.com/tag/mcmc/
- https://www.tweag.io/blog/2019-10-25-mcmc-intro1/

---
class: inverse, left, middle, animated, rotateInUpLeft

</br>

# ADIM: Bayesian Computation and Mixed models

## Master's Degree in Data Analysis, Process Improvement and Decision Support Engineering

</br>

<font size="6"> Joaquín Martínez-Minaya, 2024-12-09 </font>

</br>

<a href="http://vabar.es/" style="color:white;" > VAlencia BAyesian Research Group </a>

</br>

<a href="https://smeg-bayes.org/" style="color:white;"> Statistical Modeling Ecology Group </a>

</br>

<a href="https://giem.blogs.upv.es/" style="color:white;"> Grupo de Ingeniería Estadística Multivariante </a>

</br>

<a href="mailto:jmarmin@eio.upv.es" style="color:white;"> jmarmin@eio.upv.es </a>